Abstract - We consider the problem of rank aggregation
based on new distance measures derived through axiomatic 
approaches and based on score-based methods. In the first 
scenario, we derive novel distance measures that allow for 
discriminating between the ranking process of highest and 
lowest ranked elements in the list. These distance functions 
represent weighted versions of Kendall's $\tau$ measure and may
be computed efficiently in polynomial time. Furthermore, we 
describe how such axiomatic approaches may be extended to the 
study of score-based aggregation and present the first analysis 
of distributed vote aggregation over networks. 

I. Introduction 

Rank aggregation is a classical problem frequently encountered
in social sciences, web search and Internet service analysis,
expert opinion and voting theory [1]-[7]. The
problem can be succinctly described as follows: a set of
"voters" or "experts" is presented with a set of distinguishable
entities (objects, individuals, movies), typically represented
by the set $\{1, 2, \ldots, n\}$. The voters' task is to arrange the
entities in decreasing order of preference and pass on their 
ordered lists to an aggregator. The aggregator outputs a 
single preference list used as a representative of all voters. 
Hence, one has to be able to adequately measure the quality 
of representation made by a vote aggregator. Two distinct 
analytical rank aggregation methods were proposed so far, 
namely, distance-based methods and score-(position-)based 
methods. In the first case, the quality of the aggregate is 
measured via a distance function that describes how close 
the aggregate is to each individual vote. In the second case, 
the aggregate is obtained by computing a score for each 
ranked entity and then arranging the entities based on their 
score. Well-known distance measures include Kendall's $\tau$ and
Spearman's Footrule [8].

The goal of this work is to propose two novel research 
directions in rank aggregation: one, which builds upon the 
existing work of distance-based aggregation, but expands the 
scope and applicability of vote-distances; and another, which 
sets the stage for analyzing score-based vote aggregations 
over networks. The results presented in the paper include a 
new set of voting-fairness axioms that lead to distance mea- 
sures previously unknown in literature, as well as an analysis 
of consensus in distributed score-based voting systems. 

Our work on aggregation distance analysis is motivated 
by the following observations: a) in many applications, the 
top of the ranking is more important than the bottom and so 
changes to the top of the list must result in a more significant 
change in the aggregate ranking than changes to the bottom 
of the list; b) ranked entities may have different degrees of 
similarity and often the goal is to find the most diverse, yet
highest ranked entities. Hence, swapping elements that are
similar should be penalized less than swapping those that
are not. To the best of the authors' knowledge, the work
of Sculley [5] represents the only method proposed so far
for handling similarity in rank aggregation. Sculley presents
an aggregation method, based on the use of Markov chains
first introduced by Dwork et al., with the goal of assigning
similar ranks to similar items. A handful of results are known
for rank aggregation distances that address the problem of
positional relevance, i.e. the significance of the top versus the
bottom of the ranking [7]. In this context, we introduce the
notions of weighted Kendall distance and weighted Cayley 
distance, both capable of addressing the top versus bottom 
ranking issue, and provide axiomatic characterizations for 
these distance measures. 

The work on vote aggregation over networks considers 
the issue of reaching consensus about the aggregate ranking 
in an arbitrary network, either through local interactions 
or based on gossip algorithms. The assumption is that
voters are connected through a social network that allows 
them to adjust their votes based on the opinions of their 
neighbors or randomly chosen network nodes, or even based 
on exogenous opinions. For a special type of score-based 
scheme - Borda's rule - we show that convergence to a vote 
consensus occurs and we determine the rate of convergence. 
The analysis of rank aggregation over networks for distance-based
aggregation rules, and in particular for Kendall's $\tau$
and the weighted Kendall distance, is postponed to the full 
version of the paper. 

The paper is organized as follows. An overview of relevant 
concepts, definitions, and terminology is presented in Sec- 
tion II. Weighted Kendall distance measures and extensions 
thereof, as well as their axiomatic definitions, are presented in
Sections III and IV. Section V is devoted to the analysis of 
gossip algorithms for rank aggregation. 

II. Preliminaries 

Suppose one is given a set $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_m\}$ of
rankings, where each ranking $\sigma_i$ represents a permutation in
$\mathbb{S}_n$, the symmetric group of degree $n$.

Given a distance function $d$ over the permutations in $\mathbb{S}_n$,
the distance-based aggregation problem can be stated as

$$\min_{\pi \in \mathbb{S}_n} \sum_{i=1}^{m} d(\pi, \sigma_i).$$

In words, the goal is to find a ranking $\pi$ with minimum cumulative
distance from $\Sigma$. Clearly, the choice of the distance
function $d$ is an important feature for all distance-based rank
aggregation methods. Many distance measures in use were
derived by starting from a reasonable set of axioms and then
showing that the given distance measure is a unique solution
under the given set of axioms.$^{1}$ A distance function derived
in this manner is Kendall's $\tau$ distance, based on Kemeny's
axioms [?].
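As a minimal sketch of the distance-based formulation above, the following brute-force search enumerates every candidate ranking and keeps one with the smallest cumulative distance. The function names and the tuple encoding of rankings are illustrative assumptions, not taken from the paper. The distance used here counts pairwise disagreements, which the text identifies with Kendall's $\tau$, and the exhaustive search is only practical for small $n$.

```python
from itertools import combinations, permutations

def pair_disagreements(pi, sigma):
    """Count pairs {a, b} that pi and sigma rank in opposite order."""
    pos_pi = {x: k for k, x in enumerate(pi)}
    pos_sigma = {x: k for k, x in enumerate(sigma)}
    return sum(
        1
        for a, b in combinations(pi, 2)
        if (pos_pi[a] < pos_pi[b]) != (pos_sigma[a] < pos_sigma[b])
    )

def aggregate_by_distance(votes, dist=pair_disagreements):
    """Return a ranking minimizing the cumulative distance to all votes.

    Ties are resolved by enumeration order (an assumption of this sketch).
    """
    objects = sorted(votes[0])
    return min(
        permutations(objects),
        key=lambda cand: sum(dist(cand, vote) for vote in votes),
    )

votes = [(1, 2, 3), (1, 2, 3), (3, 2, 1)]
print(aggregate_by_distance(votes))
```

Passing a different `dist` function swaps in another distance without changing the search itself.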

On the other hand, score-based methods are centered 
around aggregators that assign scores to objects based on 
their positions in the rankings of $\Sigma$. Objects are then sorted
according to their scores to obtain the aggregate ranking. One
of the best known rules in this family is Borda's aggregation
rule, introduced by Jean-Charles de Borda [10], wherein, for
each ranking $\sigma_i$, object $j$ receives score $b_i^j = \sigma_i^{-1}(j)$. The
average score of object $j$ is $\bar{b}^j = \frac{1}{m} \sum_{i=1}^{m} b_i^j$. The aggregate
ranking is obtained by assigning the highest rank to the object
with the lowest average score, the second highest rank to
the object with the second lowest average score and so on.
Borda's method also has an axiomatic underpinning: in the
context of social choice functions, Young [11] presented a
set of axioms that showed that Borda's rule is the unique 
social choice function that satisfies the given axioms. A 
social choice function is a rule indicating a set of winners 
when votes are given as rankings. Note that although similar, 
a social choice function differs from an aggregation rule; 
while a social choice function returns a set of winners, an 
aggregation rule ranks all objects. 
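The Borda rule described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: votes are lists of object labels from most to least preferred, the score of an object in a vote is its 1-based position, and ties in the average score are broken by object label (the paper does not specify a tie-breaking rule).

```python
def borda_aggregate(votes):
    """Rank objects by increasing average position (Borda's rule)."""
    m = len(votes)
    objects = sorted(votes[0])
    # Average 1-based position of each object over all votes.
    avg = {j: sum(vote.index(j) + 1 for vote in votes) / m for j in objects}
    # Lowest average score gets the highest rank; ties broken by label.
    ranking = sorted(objects, key=lambda j: (avg[j], j))
    return ranking, avg

ranking, avg = borda_aggregate([[1, 2, 3], [2, 1, 3], [1, 3, 2]])
print(ranking, avg)
```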

In what follows, we introduce the notation used throughout
the paper and provide a novel proof for the uniqueness of
Kendall's $\tau$ distance function for a set of reduced Kemeny
axioms [?].

Let $e = 12 \cdots n$ denote the identity permutation (ranking).

Definition 1. A transposition of two elements $a, b \in [n]$
in a permutation $\pi$ is the swap of elements in positions $a$
and $b$, and is denoted by $(a\ b)$. In general, we reserve the
notation $\tau$ for an arbitrary transposition and when there is no
confusion, we consider a transposition to be a permutation
itself. If $|a - b| = 1$, the transposition is referred to as an
adjacent transposition.

It is well known that any permutation may be reduced to
$e$ via transpositions or adjacent transpositions. The former
process is referred to as sorting, while the latter is known
as sorting with adjacent transpositions. The smallest number
of adjacent transpositions needed to sort a permutation $\pi$ is
known as the inversion number of the permutation. Equivalently,
the corresponding distance $d(e, \pi)$ is known as
Kendall's $\tau$ distance. Kendall's $\tau$ can be computed in
time $O(n^2)$.
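To illustrate the link between sorting with adjacent transpositions and the inversion number, the sketch below counts inversions directly in $O(n^2)$ time and, separately, counts the swaps performed by a bubble sort. Each bubble-sort swap removes exactly one inversion, so the two counts coincide. The function names are illustrative, and rankings are assumed to be lists of the integers $1, \ldots, n$.

```python
def inversion_number(pi):
    """Number of position pairs i < j with pi[i] > pi[j] (O(n^2) scan)."""
    n = len(pi)
    return sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])

def adjacent_sort_length(pi):
    """Sort pi with adjacent transpositions (bubble sort) and count the swaps."""
    seq = list(pi)
    swaps = 0
    changed = True
    while changed:
        changed = False
        for k in range(len(seq) - 1):
            if seq[k] > seq[k + 1]:
                seq[k], seq[k + 1] = seq[k + 1], seq[k]
                swaps += 1
                changed = True
    return swaps

print(inversion_number([4, 3, 1, 2]), adjacent_sort_length([4, 3, 1, 2]))
```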

We also find the following set useful in our analysis,

$$A(\pi, \sigma) = \{(\tau_1, \ldots, \tau_m) : m \in \mathbb{N},\ \sigma = \pi \tau_1 \cdots \tau_m,\ \tau_i = (a_i\ a_i + 1),\ i \in [m]\},$$

i.e., the set of all sequences of adjacent transpositions that
transform $\pi$ into $\sigma$.



For a ranking $\pi \in \mathbb{S}_n$ and $a, b \in [n]$, $\pi$ is said to rank
$a$ before $b$ if $\pi^{-1}(a) < \pi^{-1}(b)$. We denote this relationship
as $a <_\pi b$. Two rankings $\pi$ and $\sigma$ agree on a pair $\{a, b\}$ of
elements if both rank $a$ before $b$ or both rank $b$ before $a$.
Furthermore, the two rankings $\pi$ and $\sigma$ disagree on the pair
$\{a, b\}$ if one ranks $a$ before $b$ and the other ranks $b$ before
$a$.

For example, consider $\pi = 1234$ and $\sigma = 4213$. We have
that $4 <_\sigma 1$ and that $\pi$ and $\sigma$ agree for $\{2, 3\}$ but disagree
for $\{1, 2\}$.

Definition 2. A ranking $\omega$ is said to be between two
rankings $\pi$ and $\sigma$, denoted by $\pi - \omega - \sigma$, if for each pair of
elements $\{a, b\}$, $\omega$ agrees with $\pi$ or $\sigma$ (or both). The
rankings $\pi_0, \ldots, \pi_m$ are said to be on a line, denoted by
$\pi_0 - \pi_1 - \cdots - \pi_m$, if for every $i$, $j$, and $k$ for which $0 \le i <
j < k \le m$, we have $\pi_i - \pi_j - \pi_k$.
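The notions of agreement, betweenness and being on a line translate directly into code. The sketch below is illustrative only: the helper names are hypothetical and rankings are assumed to be lists of object labels. It checks the example $\pi = 1234$, $\sigma = 4213$ from the text, and the chain 32145, 23145, 21345, 12345 that appears later in the paper.

```python
from itertools import combinations

def ranks_before(pi, a, b):
    """True if ranking pi places a before b, i.e. pi^{-1}(a) < pi^{-1}(b)."""
    return pi.index(a) < pi.index(b)

def agree(pi, sigma, a, b):
    """True if pi and sigma order the pair {a, b} the same way."""
    return ranks_before(pi, a, b) == ranks_before(sigma, a, b)

def is_between(omega, pi, sigma):
    """True if omega agrees with pi or sigma (or both) on every pair."""
    return all(
        agree(omega, pi, a, b) or agree(omega, sigma, a, b)
        for a, b in combinations(sorted(pi), 2)
    )

def on_a_line(rankings):
    """True if rankings[j] is between rankings[i] and rankings[k] for all i < j < k."""
    return all(
        is_between(rankings[j], rankings[i], rankings[k])
        for i, j, k in combinations(range(len(rankings)), 3)
    )

pi, sigma = [1, 2, 3, 4], [4, 2, 1, 3]
print(ranks_before(sigma, 4, 1), agree(pi, sigma, 2, 3), agree(pi, sigma, 1, 2))
```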

The basis of our subsequent analysis is the following set
of axioms required for a rank aggregation measure, first
introduced by Kemeny [?]:

Axioms I

1) $d$ is a metric.

2) $d$ is left-invariant, i.e. $d(\sigma\pi, \sigma\omega) = d(\pi, \omega)$, for any
$\pi, \sigma, \omega \in \mathbb{S}_n$. In words, relabeling of objects should
not change the distance between permutations.

3) For any $\pi$, $\sigma$, and $\omega$, $d(\pi, \sigma) = d(\pi, \omega) + d(\omega, \sigma)$ if
and only if $\omega$ is between $\pi$ and $\sigma$. This axiom may be
viewed through a geometric lens: the triangle inequality
has to be satisfied with equality for all points that lie on a
"straight line" between $\pi$ and $\sigma$.

4) The smallest positive distance is one. This axiom is
only used for normalization purposes.

Kemeny's original exposition included a fifth axiom which 
we restate for completeness: If two rankings $\pi$ and $\sigma$ agree
except for a segment of $k$ elements, the position of the
segment within the ranking is not important. Here, a segment
represents a set of objects that are ranked consecutively, i.e.,
a substring of the permutation. As an example, this axiom
implies that

$$d(123\,\underbrace{456},\ 123\,\underbrace{654}) = d(1\,\underbrace{456}\,23,\ 1\,\underbrace{654}\,23),$$

where the segment is denoted by braces. This axiom is
redundant since an equally strong statement follows from the 
other four axioms, as we demonstrate below. Our alternative 
proof of Kemeny's result also reveals a simple method 
for generalizing the axioms in order to arrive at weighted 
distance measures. 

Lemma 3. For any $d$ that satisfies Axioms I, and for any set
of permutations $\pi_0, \ldots, \pi_m$ such that $\pi_0 - \pi_1 - \cdots - \pi_m$, one
has

$$d(\pi_0, \pi_m) = \sum_{k=1}^{m} d(\pi_{k-1}, \pi_k).$$

Proof: The lemma follows from Axiom I.3 by
induction. ■

1 This is to be contrasted with the celebrated Arrow's impossibility theorem [9].



Lemma 4. For any $d$ that satisfies Axioms I, we have that
$d((i\ i+1), e) = d((12), e)$, $i \in [n-1]$.

Proof: We show that $d((23), e) = d((12), e)$. Repeating
the same argument used for proving this special case
gives $d((i\ i+1), e) = d((i-1\ i), e) = \cdots = d((12), e)$.

To show that $d((23), e) = d((12), e)$, we evaluate $d(\pi, e)$
in two ways, where we choose $\pi = 32145 \cdots n$.

On the one hand, note that $\pi - \omega - \eta - e$ where $\omega = \pi(12) =
23145 \cdots n$ and $\eta = \omega(23) = 21345 \cdots n$. As a result,

$$d(\pi, e) \overset{(a)}{=} d(\pi, \omega) + d(\omega, \eta) + d(\eta, e)$$
$$= d(\omega^{-1}\pi, e) + d(\eta^{-1}\omega, e) + d(\eta, e)$$
$$= d((12), e) + d((23), e) + d((12), e), \qquad (1)$$

where (a) follows from Lemma 3.

On the other hand, note that $\pi - \alpha - \beta - e$ where $\alpha = \pi(23) =
31245 \cdots n$ and $\beta = \alpha(12) = 13245 \cdots n$. For this case,

$$d(\pi, e) = d(\pi, \alpha) + d(\alpha, \beta) + d(\beta, e)$$
$$= d(\alpha^{-1}\pi, e) + d(\beta^{-1}\alpha, e) + d(\beta, e)$$
$$= d((23), e) + d((12), e) + d((23), e). \qquad (2)$$

Expressions (1) and (2) imply that $d((23), e) =
d((12), e)$. ■

Lemma 5. For any $d$ that satisfies Axioms I, $d(\gamma, e)$ equals
the minimum number of adjacent transpositions required to
transform $\gamma$ into $e$.

Proof: Let

$$L(\pi, \sigma) = \{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma) : \pi - \pi\tau_1 - \pi\tau_1\tau_2 - \cdots - \sigma\}$$

be the subset of $A(\pi, \sigma)$ consisting of sequences of transpositions
that transform $\pi$ to $\sigma$ by passing through a line. Let $m$
be the minimum number of adjacent transpositions that transform
$\gamma$ into $e$. Furthermore, let $(\tau_1, \tau_2, \ldots, \tau_m) \in A(\gamma, e)$
and define $\gamma_i = \gamma\tau_1 \cdots \tau_i$, $i = 0, \ldots, m$, with $\gamma_0 = \gamma$ and
$\gamma_m = e$.

First, we show that $\gamma_0 - \gamma_1 - \cdots - \gamma_m$, that is,
$(\tau_1, \tau_2, \ldots, \tau_m) \in L(\gamma, e)$. Suppose this were not the
case. Then, there would exist $i < j < k$ such that $\gamma_i$, $\gamma_j$,
and $\gamma_k$ are not on a line, and thus, there would exist a
pair $\{r, s\}$ for which $\gamma_j$ disagrees with both $\gamma_i$ and $\gamma_k$.
Hence, there would be two transpositions, $\tau_{i'}$ and $\tau_{j'}$, with
$i < i' \le j$ and $j < j' \le k$, that swap $r$ and $s$. We could
in this case remove $\tau_{i'}$ and $\tau_{j'}$ from $(\tau_1, \ldots, \tau_m)$ to obtain
$(\tau_1, \ldots, \tau_{i'-1}, \tau_{i'+1}, \ldots, \tau_{j'-1}, \tau_{j'+1}, \ldots, \tau_m) \in A(\gamma, e)$ with
length $m - 2$. This contradicts the optimality of the choice
of $m$. Hence, $(\tau_1, \tau_2, \ldots, \tau_m) \in L(\gamma, e)$. Then Lemma 3
and left-invariance imply that

$$d(\gamma, e) = \sum_{i=1}^{m} d(\tau_i, e). \qquad (3)$$

From (3), it is clear that the minimum positive distance
from the identity is obtained by some adjacent transposition.
But Lemma 4 states that all adjacent transpositions have
the same distance from the identity. Hence, from Axiom
I.4, we have $d(\tau, e) = 1$ for every adjacent transposition $\tau$.
This observation completes the proof of the lemma, since it
implies that

$$d(\gamma, e) = \sum_{i=1}^{m} d(\tau_i, e) = \sum_{i=1}^{m} 1 = m. \qquad ■$$

Lemma 6. For any $d$ that satisfies Axioms I, we have

$$d(\pi, \sigma) = \min\{m : (\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)\}.$$

That is, $d(\pi, \sigma)$ equals the minimum number of adjacent
transpositions required to transform $\pi$ into $\sigma$.

Proof: We have $(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)$ if and only if
$(\tau_1, \ldots, \tau_m) \in A(\sigma^{-1}\pi, e)$. Furthermore, left-invariance of
$d$ implies that $d(\pi, \sigma) = d(\sigma^{-1}\pi, e)$. Hence,

$$d(\pi, \sigma) = d(\sigma^{-1}\pi, e)$$
$$= \min\{m : (\tau_1, \ldots, \tau_m) \in A(\sigma^{-1}\pi, e)\}$$
$$= \min\{m : (\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)\},$$

where the second equality follows from Lemma 5. ■

Theorem 7. The unique distance $d$ that satisfies Axioms I is

$$d_\tau(\pi, \sigma) = \min\{m : (\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)\}.$$

Proof: The fact that $d_\tau$ satisfies Axioms I can be easily
verified. Uniqueness follows from Lemma 6. ■

III. Weighted Kendall Distance 

Our proof of the uniqueness of Kendall's $\tau$ distance under
Axioms I reveals an important insight: Kendall's measure
arises due to the fact that adjacent transpositions have uniform
costs, which is a consequence of the betweenness property
of one of the axioms. If one had a ranking problem in
which costs of transpositions depended either on the elements
involved or on their locations, the uniformity assumption would
have to be changed. As we show below, a way to achieve this goal is
to redefine the axioms in terms of the betweenness property.

Axioms II 

1) d is a pseudo-metric, i.e. a generalized metric in which 
two distinct points may be at zero distance. 

2) d is left-invariant. 

3) For any $\pi$, $\sigma$ disagreeing for more than one pair of
elements, there exists an $\omega$ such that $d(\pi, \sigma) = d(\pi, \omega) +
d(\omega, \sigma)$.

Lemma 8. For any distance $d$ that satisfies Axioms II, and
for distinct $\pi$ and $\sigma$, we have

$$d(\pi, \sigma) = \min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e).$$

Proof: First, suppose that $\pi$ and $\sigma$ disagree on one pair
of elements. Then, we have $\sigma = \pi(a\ a+1)$ for some $a \in
[n-1]$. For each $(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)$, there exists an
index $j$ such that $\tau_j = (a\ a+1)$ and thus

$$\sum_{i=1}^{m} d(\tau_i, e) \ge d(\tau_j, e) = d((a\ a+1), e),$$

implying

$$\min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e) \ge d((a\ a+1), e). \qquad (4)$$

On the other hand, since $((a\ a+1)) \in A(\pi, \sigma)$,

$$\min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e) \le d((a\ a+1), e). \qquad (5)$$

From (4) and (5),

$$\min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e) = d((a\ a+1), e) = d(\pi, \sigma),$$

where the last equality follows from the left-invariance of $d$.

Next, suppose $\pi$ and $\sigma$ disagree for more than one pair
of elements. A sequential application of Axiom II.3 implies
that

$$d(\pi, \sigma) = \min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e),$$

which proves the claimed result. ■

Definition 9. A distance $d$ is termed a weighted Kendall
distance if there is a nonnegative weight function $\varphi$ over the
set of adjacent transpositions such that

$$d(\pi, \sigma) = \min_{(\tau_1, \ldots, \tau_m) \in A(\pi, \sigma)} \sum_{i=1}^{m} \varphi_{\tau_i},$$

where $\varphi_\tau$ is the weight assigned to transposition $\tau$ by $\varphi$.

Note that a weighted Kendall distance is completely determined
by its weight function $\varphi$.
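For a general weight function, Definition 9 can be evaluated on small instances as a shortest-path problem: rankings are vertices, adjacent transpositions are edges, and edge $(i\ i+1)$ costs $\varphi_{(i\ i+1)}$. The sketch below uses Dijkstra's algorithm for this purpose. This is an illustrative choice, not a method proposed in the paper, and its cost grows with $n!$. The list `phi` is a hypothetical encoding in which `phi[i-1]` is the weight of $(i\ i+1)$.

```python
import heapq

def weighted_kendall(pi, sigma, phi):
    """Minimum total weight of adjacent transpositions taking pi to sigma."""
    start, goal = tuple(pi), tuple(sigma)
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, cur = heapq.heappop(heap)
        if cur == goal:
            return cost
        if cost > best[cur]:
            continue  # stale heap entry
        for i, w in enumerate(phi):
            # Apply the adjacent transposition swapping positions i+1 and i+2.
            nxt = list(cur)
            nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
            nxt = tuple(nxt)
            if cost + w < best.get(nxt, float("inf")):
                best[nxt] = cost + w
                heapq.heappush(heap, (cost + w, nxt))
    raise ValueError("sigma is not a permutation of pi")

print(weighted_kendall((4, 3, 1, 2), (1, 2, 3, 4), [1, 1, 1]))
print(weighted_kendall((4, 3, 1, 2), (1, 2, 3, 4), [3, 2, 1]))
```

With unit weights the result reduces to the number of inversions, matching Kendall's $\tau$.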

Theorem 10. A distance $d$ satisfies Axioms II if and only if
it is a weighted Kendall distance.

Proof: It follows immediately from Lemma 8 that a distance
$d$ satisfying Axioms II is a weighted Kendall distance
by letting

$$\varphi_\tau = d(\tau, e)$$

for every adjacent transposition $\tau$.

The proof of the converse is omitted since it is easy to
verify that a weighted Kendall distance satisfies Axioms II. ■

The weighted Kendall distance provides a natural solution
for issues related to the importance of the top-ranked candidates.
Due to space limitations, we refer the reader interested
in other applications of weighted distances to our recent work
[12].

Computing the Weighted Kendall Distance 

Computing the weighted Kendall distance between two
rankings for general weight functions is not a task as straightforward
as computing Kendall's $\tau$ distance. However,
in what follows, we show that for an important class of 
weight functions - termed "monotonic" weight functions - 
the weighted Kendall distance can be computed efficiently. 

Definition 11. A weight function $\varphi : \mathbb{A}_n \to \mathbb{R}^+$, where $\mathbb{A}_n$
is the set of adjacent transpositions in $\mathbb{S}_n$, is decreasing if
$i > j$ implies that $\varphi_{(i\ i+1)} \le \varphi_{(j\ j+1)}$. Increasing weight
functions are defined similarly.

Decreasing weight functions are important as they can be
used to model the significance of the top of the ranking by
assigning higher weights to transpositions at the top of the
list.

Suppose a transformation $\tau = (\tau_1, \ldots, \tau_m)$ of length $m$
transforms $\pi$ into $\sigma$. The transformation may be viewed as a
sequence of moves of elements indexed by $i$, $i = 1, \ldots, n$,
from position $\pi^{-1}(i)$ to position $\sigma^{-1}(i)$. Let the walk along
which element $i$ is moved by transformation $\tau$ be denoted
by $p^{i,\tau} = (p^{i,\tau}_1, \ldots, p^{i,\tau}_{|p^{i,\tau}|+1})$, where $|p^{i,\tau}|$ is the length of
the walk $p^{i,\tau}$.

We investigate the lengths of the walks $p^{i,\tau}$, $i \in [n]$. Let
$\mathcal{I}_i(\pi, \sigma)$ be the set consisting of elements $j \in [n]$ such that $\pi$
and $\sigma$ disagree on the pair $\{i, j\}$. Furthermore, let $I_i(\pi, \sigma) =
|\mathcal{I}_i(\pi, \sigma)|$. In the transformation $\tau$, all elements of $\mathcal{I}_i(\pi, \sigma)$
must be swapped with $i$ by some $\tau_k$, $k \in [m]$. Each such swap
contributes length one to the walk $p^{i,\tau}$ and thus, $|p^{i,\tau}| \ge
I_i(\pi, \sigma)$.

It is easy to see that

$$d_\varphi(\pi, \sigma) = \min_{\tau \in A(\pi, \sigma)} \sum_{i=1}^{n} \frac{1}{2} \sum_{j=1}^{|p^{i,\tau}|} \varphi_{(p^{i,\tau}_j\ p^{i,\tau}_{j+1})},$$

where the factor $1/2$ accounts for the fact that each transposition
moves two elements. Considering individual walks, we may thus write

$$d_\varphi(\pi, \sigma) \ge \sum_{i=1}^{n} \frac{1}{2} \min_{p^i \in P_i} \sum_{j=1}^{|p^i|} \varphi_{(p^i_j\ p^i_{j+1})}, \qquad (6)$$

where, for each $i$, $P_i$ is the set of all walks of length
$I_i(\pi, \sigma)$ starting from $\pi^{-1}(i)$ and ending at $\sigma^{-1}(i)$. Since
$\varphi$ is decreasing, the minimum is attained by the walks
$p^{i,*} = (\pi^{-1}(i), \ldots, \ell_i - 1, \ell_i, \ell_i - 1, \ldots, \sigma^{-1}(i))$ where $\ell_i$
is the solution to the equation

$$\ell_i - \pi^{-1}(i) + \ell_i - \sigma^{-1}(i) = I_i(\pi, \sigma)$$

and thus $\ell_i = (\pi^{-1}(i) + \sigma^{-1}(i) + I_i(\pi, \sigma))/2$.

We show next that there exists a transformation $\tau^*$ such
that $p^{i,\tau^*} = p^{i,*}$ and thus equality in (6) can be achieved.
The transformation in question, $\tau^*$, transforms $\pi$ into $\sigma$ in $n$
rounds. In round $i$, $\tau^*$ moves $i$ through a sequence of adjacent
transpositions from position $\pi^{-1}(i)$ to position $\sigma^{-1}(i)$. It can
be seen that, for each $i$, $p^{i,\tau^*} = (\pi^{-1}(i), \ldots, \ell'_i - 1, \ell'_i, \ell'_i -
1, \ldots, \sigma^{-1}(i))$ for some $\ell'_i$. Since each transposition in $\tau^*$
decreases the number of inversions by one, $\ell'_i$ also satisfies
the equation

$$\ell'_i - \pi^{-1}(i) + \ell'_i - \sigma^{-1}(i) = I_i(\pi, \sigma),$$

implying that $\ell'_i = \ell_i$ and thus $p^{i,\tau^*} = p^{i,*}$. Consequently,
one has the following proposition.

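Proposition 12. For rankings $\pi, \sigma \in \mathbb{S}_n$ and a decreasing
weight function $\varphi$, we have

$$d_\varphi(\pi, \sigma) = \sum_{i=1}^{n} \frac{1}{2} \left( \sum_{j=\pi^{-1}(i)}^{\ell_i - 1} \varphi_{(j\ j+1)} + \sum_{j=\sigma^{-1}(i)}^{\ell_i - 1} \varphi_{(j\ j+1)} \right),$$

where $\ell_i = (\pi^{-1}(i) + \sigma^{-1}(i) + I_i(\pi, \sigma))/2$.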

Example 13. Consider the rankings $\pi = 4312$ and $e = 1234$
and a decreasing weight function $\varphi$. We have $I_i(\pi, e) = 2$
for $i = 1, 2$ and $I_i(\pi, e) = 3$ for $i = 3, 4$. Furthermore,

$$\ell_1 = \frac{3 + 1 + 2}{2} = 3, \qquad p^{1,*} = (3, 2, 1),$$
$$\ell_2 = \frac{4 + 2 + 2}{2} = 4, \qquad p^{2,*} = (4, 3, 2),$$
$$\ell_3 = \frac{2 + 3 + 3}{2} = 4, \qquad p^{3,*} = (2, 3, 4, 3),$$
$$\ell_4 = \frac{1 + 4 + 3}{2} = 4, \qquad p^{4,*} = (1, 2, 3, 4).$$

The minimum weight transformation is

$$\tau^* = (\underbrace{(32), (21)}_{1}, \underbrace{(43), (32)}_{2}, \underbrace{(43)}_{3}),$$

where the numbers under the braces are the elements
moved by the indicated transpositions. The distance between
$\pi$ and $e$ is

$$d_\varphi(\pi, e) = \varphi_{(12)} + 2\varphi_{(23)} + 2\varphi_{(34)}.$$

Note that the result above implies that at least for one class
of interesting weight functions that capture the importance of
the position in the ranking, the computation of the distance is
of the same order of complexity as that of the standard Kendall's
$\tau$ distance. Hence, distance computation does not represent a
bottleneck for the employment of weighted distance metrics.
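The closed form of Proposition 12 can be evaluated without searching over transformations. The following sketch is one possible direct implementation, assuming a decreasing weight function passed as a list `phi` with `phi[j-1]` equal to the weight of $(j\ j+1)$. The function name and this encoding are illustrative. It reproduces the value $\varphi_{(12)} + 2\varphi_{(23)} + 2\varphi_{(34)}$ of Example 13.

```python
def weighted_kendall_decreasing(pi, sigma, phi):
    """Evaluate the Proposition 12 formula; phi must be nonincreasing."""
    pos_pi = {x: k + 1 for k, x in enumerate(pi)}        # pi^{-1}, 1-based
    pos_sigma = {x: k + 1 for k, x in enumerate(sigma)}  # sigma^{-1}, 1-based
    total = 0
    for i in pi:
        # I_i: number of elements j on which pi and sigma disagree for {i, j}.
        disagreements = sum(
            1
            for j in pi
            if j != i
            and (pos_pi[i] < pos_pi[j]) != (pos_sigma[i] < pos_sigma[j])
        )
        ell = (pos_pi[i] + pos_sigma[i] + disagreements) // 2
        # Walk from pi^{-1}(i) up to ell, then back down to sigma^{-1}(i).
        total += sum(phi[j - 1] for j in range(pos_pi[i], ell))
        total += sum(phi[j - 1] for j in range(pos_sigma[i], ell))
    return total / 2  # each transposition was counted for both moved elements

phi = [3, 2, 1]  # phi_(12), phi_(23), phi_(34)
print(weighted_kendall_decreasing([4, 3, 1, 2], [1, 2, 3, 4], phi))
```

The double loop over elements is quadratic in $n$, in line with the remark that the computation is of the same order as for Kendall's $\tau$.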

IV. Generalizing Kemeny's Approach 

We proceed by showing how Kemeny's axiomatic ap- 
proach may be extended further to introduce a number of 
new distance metrics useful in different ranking scenarios.

The first distance applies when only certain subsets of 
transpositions are allowed - for example, when only elements 
of a class may be reordered to obtain an aggregated ranking. 

Definition 14. Consider a subset $G = \{g_1, \ldots, g_m\}$ of $\mathbb{S}_n$
such that $g \in G$ implies that $g^{-1} \in G$. Rankings $\pi$ and $\omega$
are $G$-adjacent if there exists $g \in G$ such that $\pi = \omega g$.

A $G$-transformation of $\pi$ into $\sigma$ is a vector
$(g_1, \ldots, g_k)$, $k \in \mathbb{N}$, with $g_i \in G$, $i \in [k]$, such
that $\sigma = \pi g_1 g_2 \cdots g_k$, where $k$ is the length of the
$G$-transformation. The set of $G$-transformations of $\pi$ into
$\sigma$ is denoted by $A_G(\pi, \sigma)$. A minimum $G$-transformation
is a $G$-transformation of minimum length.

Furthermore, $\omega$ is said to be $G$-between $\pi$ and $\sigma$ if there
exists a minimum transformation $(g_1, \ldots, g_k)$ of $\pi$ into $\sigma$ such
that $\omega = \pi g_1 \cdots g_j$ for some $j \in [k]$.

Definition 15. For a subset $G$ of $\mathbb{S}_n$, a function $d : \mathbb{S}_n \times \mathbb{S}_n \to
[0, \infty]$ is said to be a uniform $G$-distance if

1) $d$ is a metric.

2) $d$ is left-invariant.

3) For any $\pi, \sigma \in \mathbb{S}_n$, if $\omega$ is $G$-between $\pi$ and $\sigma$, then
$d(\pi, \sigma) = d(\pi, \omega) + d(\omega, \sigma)$.

4) The smallest positive distance is one.

Lemma 3 also applies to $G$-betweenness and can be
restated as follows.

Lemma 16. For a uniform $G$-distance $d$, and for
$\pi_0, \ldots, \pi_m$ such that $\pi_0 - \pi_1 - \cdots - \pi_m$, we have

$$d(\pi_0, \pi_m) = \sum_{k=1}^{m} d(\pi_{k-1}, \pi_k).$$

Remark 17. For some choices of $G$, as in Lemma 4 and
Lemma 20 below, one may show that all elements
of $G$ have distance one from the identity. For such $G$, it
is easy to see that the uniform $G$-distance $d$ exists and is
unique, with

$$d(\pi, \sigma) = \min\{m : (\tau_1, \ldots, \tau_m) \in A_G(\pi, \sigma)\}.$$

Definition 18. For a subset $G$ of $\mathbb{S}_n$, a function $d : \mathbb{S}_n \times \mathbb{S}_n \to
[0, \infty]$ is said to be a weighted $G$-distance if

1) $d$ is a pseudo-metric.

2) $d$ is left-invariant.

3) For any $\pi, \sigma \in \mathbb{S}_n$, if $\pi$ and $\sigma$ are not $G$-adjacent,
there exists an $\omega$ that is $G$-between $\pi$ and $\sigma$, distinct from both,
such that $d(\pi, \sigma) = d(\pi, \omega) + d(\omega, \sigma)$.

Remark 19. It is straightforward to see that the weighted
$G$-distance $d$ exists and is uniquely determined by the values
$d(g, e)$, $g \in G$, as

$$d(\pi, \sigma) = \min \sum_{i=1}^{m} d(\tau_i, e),$$

where the minimum is taken over all $G$-transformations
$(\tau_1, \ldots, \tau_m)$ of $\pi$ into $\sigma$.

As an example, let $G$ from Definitions 15 and 18 be the set

$$\mathbb{T}_n = \{(a\ b) : a, b \in [n],\ a \ne b\}$$

of all transpositions.

The following lemma states that for a uniform
$\mathbb{T}_n$-distance, all transpositions have equal distance
from the identity.

Lemma 20. For a uniform $\mathbb{T}_n$-distance $d$, we have

$$d((a\ b), e) = d((c\ d), e)$$

for all transpositions $(a\ b)$ and $(c\ d)$.

Proof: For $\{a, b\} = \{c, d\}$, the lemma is obvious. We
prove the lemma for the case that $a$, $b$, $c$, and $d$ are distinct.
A similar argument applies when $\{a, b\}$ and $\{c, d\}$ have one
element in common. The argument parallels that of Lemma 4.

Let $\pi = (a\,b\,c\,d)$, $\omega = (a\ d)\pi$, $\eta = (c\ d)\omega$ and note that
$e = (b\ c)\eta$. Since $\pi - \omega - \eta - e$, by Lemma 16 and left-invariance
of $d$, we have

$$d(\pi, e) = d((a\ d), e) + d((c\ d), e) + d((b\ c), e). \qquad (7)$$

Similarly, let $\alpha = (b\ c)\pi$, $\beta = (a\ b)\alpha$, and note that $e =
(a\ d)\beta$. This shows

$$d(\pi, e) = d((b\ c), e) + d((a\ b), e) + d((a\ d), e). \qquad (8)$$

Equating the right-hand sides of (7) and (8) yields
$d((a\ b), e) = d((c\ d), e)$. ■

By combining Remark 17 and Lemma 20 we arrive at the
following theorem.

Theorem 21. The uniform $\mathbb{T}_n$-distance exists and is unique.
Namely,

$$d(\pi, \sigma) = \min\{m : (\tau_1, \ldots, \tau_m) \in A_{\mathbb{T}_n}(\pi, \sigma)\}$$

(commonly known as Cayley's distance) is the unique uniform
$\mathbb{T}_n$-distance.
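The minimum number of transpositions in Theorem 21 need not be found by search. A standard fact about permutations, not stated in the text, is that this minimum equals $n$ minus the number of cycles of the permutation relating the two rankings. The sketch below computes Cayley's distance this way. The function name and list encoding are illustrative assumptions.

```python
def cayley_distance(pi, sigma):
    """Minimum number of transpositions taking pi to sigma (n minus cycle count)."""
    pos_in_sigma = {x: k for k, x in enumerate(sigma)}
    # perm[k] is the position in sigma of the element that pi holds at position k.
    perm = [pos_in_sigma[x] for x in pi]
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            k = start
            while not seen[k]:
                seen[k] = True
                k = perm[k]
    return len(perm) - cycles

print(cayley_distance([4, 3, 1, 2], [1, 2, 3, 4]))
```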

Furthermore, Remark 19 leads to the following theorem.

Theorem 22. The weighted $\mathbb{T}_n$-distance $d$ exists and is
uniquely determined by the values $d(\tau, e)$, $\tau \in \mathbb{T}_n$, as

$$d(\pi, \sigma) = \min_{(\tau_1, \ldots, \tau_m) \in A_{\mathbb{T}_n}(\pi, \sigma)} \sum_{i=1}^{m} d(\tau_i, e).$$

The weighted transposition distance can be used to model
similarities of objects in rankings wherein transposing two
similar items induces a smaller distance than transposing two
dissimilar items [12].

Remark 23. Note that the generalization of Kemeny's axioms
may also be applied to arrive at a generalization of Borda's
score-based rule. A step in this direction was proposed by
Young [13], who showed that a set of axioms leads to a
generalization of Borda's rule wherein the $k$th preference
of each ranking receives a score $s_k$, not necessarily equal
to $k$. This generalization of Borda's rule may also be used
to address the problem of top versus bottom in rankings.
In particular, one may assign Borda scores $s_k$ to the $k$th
preference with

$$s_k = \sum_{l=1}^{k-1} \phi_l,$$

where $\phi_l$ is decreasing in $l$. For example, swapping two
elements at the top of the ranking of a given voter changes
the scores of each of the two corresponding objects by $\phi_1$,
while a similar swap at the bottom of the ranking changes
the scores by $\phi_{n-1}$. Since $\phi_1 \ge \phi_{n-1}$, changes to the top
of the list, in general, have a more significant effect on the
aggregate ranking.
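A short sketch of the generalized scores in Remark 23 follows. It assumes the increments $\phi_1, \ldots, \phi_{n-1}$ are given as a list, so that $s_1 = 0$ (an empty sum) and $s_k = \phi_1 + \cdots + \phi_{k-1}$. The helper names and the label-based tie-breaking are assumptions made for illustration.

```python
from itertools import accumulate

def positional_scores(phi):
    """Scores s_1, ..., s_n with s_k = phi_1 + ... + phi_{k-1} (so s_1 = 0)."""
    return [0] + list(accumulate(phi))

def score_aggregate(votes, scores):
    """Rank objects by increasing total score; ties broken by object label."""
    objects = sorted(votes[0])
    total = {j: sum(scores[vote.index(j)] for vote in votes) for j in objects}
    return sorted(objects, key=lambda j: (total[j], j)), total

s = positional_scores([3, 2, 1])  # decreasing increments phi_1, phi_2, phi_3
print(s)                          # a top swap moves scores by 3, a bottom swap by 1
print(score_aggregate([[1, 2, 3, 4], [2, 1, 3, 4]], s))
```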

V. Distributed Vote Aggregation 

The novel distance metrics, scoring methods and underlying
rank aggregation problems discussed in the previous
sections may be viewed as instances of rank aggregation by
$m$ agents over a fully connected graph, i.e. every agent
has access to the rankings of all other agents and hence,
fixing the aggregation distance or scores and aggregation
method (Kendall, Borda, ...) and assuming infinite computational
power, each individual can find an aggregate ranking of
the society. Thus, assuming the uniqueness of the aggregate
ranking, agents come to a consensus over the aggregate
ranking in one computational step. Nevertheless, one can
consider the more general problem of reaching consensus
about the aggregate ranking in an arbitrary network through
local interactions. In this section, we consider this problem
over general networks and provide an analysis of convergence
for a specific choice of aggregation method, namely the Borda
aggregation method. The analysis of aggregation methods
for some other distance measures described in the paper is
postponed to the full version of the paper.

Let $G = ([m], E)$ be a connected undirected graph over $m$
vertices with the edge set $E$ that represents the connectivity
pattern of agents in a network.$^{2}$ As before, we assume that
each agent $i \in [m]$ has a ranking $\sigma_i$ over $n$ entities. There are
multiple ways of distributed aggregation of opinion in such
a network, all of which are recursive schemes.

One way to perform distributed aggregation is through
neighbor aggregation. In this method, at discrete-time
instances $t = 0, 1, \ldots$, each agent maintains an estimate $\pi_i(t)$
of the aggregate ranking. At time $t$, each agent exchanges its
belief with its neighboring agents. Then, at time $t + 1$, agent
$i$ sets its belief $\pi_i(t + 1)$ to be the aggregate ranking of all
the estimates of the neighboring agents at time $t$, including
its own estimate.

Another way to do distributed aggregation is through
gossiping over networks [14]. Suppose that at each time
instance we pick an edge $\{i, j\} \in E$ with probability $p_{ij} > 0$.
Then, agents $i$ and $j$ exchange their estimates $\pi_i(t)$ and $\pi_j(t)$
at time $t$ and they both let $\pi_i(t + 1) = \pi_j(t + 1)$ be the
aggregation of $\pi_i(t)$ and $\pi_j(t)$.

A. Gossiping Borda Vectors 

We describe next a distributed method using Borda's
scheme and gossiping over networks. Let $b_i = b_i(0)$ be the
vector of the initial rankings of $n$ entities for agent $i$ (for
Borda's method we have the specific choice of $b_i = \sigma_i^{-1}$).
The goal is to compute $\bar{b} = \frac{1}{m} \sum_{i=1}^{m} b_i(0)$. One immediate
solution to find $\bar{b}$ is through gossiping over the network as
described by the following algorithm:

Distributed Rank Aggregation:

1) At time $t \ge 0$, pick an edge $\{i, i'\} \in E$ with probability
$P_{ii'} > 0$, where $\sum_{\{i,i'\} \in E} P_{ii'} = 1$.

2) Let $i, i'$ exchange their estimates $b_i(t), b_{i'}(t)$ and let
$b_i(t + 1) = b_{i'}(t + 1) = \frac{1}{2}(b_i(t) + b_{i'}(t))$.

3) For $\ell \ne i, i'$, let $b_\ell(t + 1) = b_\ell(t)$.

As proven in [15], the above scheme approaches the average
as $t$ goes to infinity.

Lemma 24. If $G = ([m], E)$ is connected, then we almost
surely have $\lim_{t \to \infty} b_i(t) = \bar{b}$.

Proof: The lemma is a direct consequence of the results
in [15]. ■
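The three steps of the Distributed Rank Aggregation algorithm can be simulated as follows. This is a minimal sketch under stated assumptions: edges are chosen uniformly at random (one particular choice of the probabilities $P_{ii'}$), agents are indexed from 0, objects are labeled $1, \ldots, n$, and the function names and `seed` parameter are hypothetical.

```python
import random

def gossip_borda(votes, edges, steps, seed=0):
    """Run pairwise-averaging gossip on the agents' Borda score vectors."""
    rng = random.Random(seed)
    n = len(votes[0])
    # b[i][j-1] is the position of object j in agent i's ranking (sigma_i^{-1}(j)).
    b = [[float(vote.index(j) + 1) for j in range(1, n + 1)] for vote in votes]
    for _ in range(steps):
        i, k = rng.choice(edges)          # step 1: pick an edge uniformly
        avg = [(x + y) / 2 for x, y in zip(b[i], b[k])]
        b[i], b[k] = avg, list(avg)       # step 2: both endpoints take the average
        # step 3: all other agents keep their current vectors
    return b

def ordering(scores):
    """Objects sorted by increasing score."""
    return sorted(range(1, len(scores) + 1), key=lambda j: scores[j - 1])

votes = [[1, 2, 3], [2, 1, 3], [1, 3, 2]]
edges = [(0, 1), (1, 2)]                  # a path graph on three agents
final = gossip_borda(votes, edges, steps=300)
print([ordering(row) for row in final])
```

Each averaging step preserves the sum of the agents' vectors, which is why the common limit is the Borda average $\bar{b}$.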

Note that in the distributed rank aggregation algorithm
the ultimate goal is to find the correct ordering of $\bar{b} =
\frac{1}{m} \sum_{i=1}^{m} b_i(0)$ rather than the vector $\bar{b}$ itself. Thus, it is not
important that the estimates of the ranking vectors converge
to $\bar{b}$, but that the estimates of the actual ranks are correct. In
other words, if for some time $t$ the ordering
of $b_i(t)$ matches the ordering of $\bar{b}$ for all $i \in [m]$, then the
society has already achieved consensus over the ranking of
the objects. Here, we derive a probabilistic bound on the
number of iterations needed to reach the
optimum ranking.

2 Many of the discussions in this section can be generalized for the case
of time-varying networks.

Throughout the following discussions, without loss of
generality we may assume that $\bar{b}$ is ordered,$^{3}$ i.e. $\bar{b}^1 < \bar{b}^2 <
\cdots < \bar{b}^n$. We say that $t$ is a consensus time for the aggregate
ranking if the ordering of $b_i(t)$ matches the ordering of $\bar{b}$ for
all $i \in [m]$. The following result follows immediately from
this definition:

Lemma 25. If $t$ is a consensus time for the ranking, then
any $t' > t$ is a consensus time for the ranking.

Proof: It suffices to show the result for $t' = t + 1$. Let
$\{i, i'\}$ be the edge that is chosen randomly at time $t$. Since $t$
is a consensus time for the ranking, we have $b_i^1(t) < \cdots <
b_i^n(t)$ and $b_{i'}^1(t) < \cdots < b_{i'}^n(t)$, and thus we also have

$$b_i^1(t + 1) = \tfrac{1}{2}\left(b_i^1(t) + b_{i'}^1(t)\right) < \cdots < \tfrac{1}{2}\left(b_i^n(t) + b_{i'}^n(t)\right) = b_i^n(t + 1),$$

which proves the claim. ■

Based on the lemma above, let us define the consensus
time $T$ for the ordering to be

$$T = \min\{t \ge 0 \mid t \text{ is a consensus time for the ordering}\}.$$

Note that for the random gossip scheme, $T$ is a random
variable and if we have an adapted process for the
random choice of edges, $T$ is a stopping time. Our goal
is to provide a probabilistic bound for $T$. For this, let
$r^j = \min\{\bar{b}^{j+1} - \bar{b}^j,\ \bar{b}^j - \bar{b}^{j-1}\}$ and let $d^j = \max_i b_i^j(0) -
\min_i b_i^j(0)$. That is, $r^j$ is the minimum distance of the average
rating of $j$ from the neighboring objects and $d^j$ is the spread
of the initial ratings of the agents for the object $j$. Then, we
have the following result.

Theorem 26. For the consensus time $T$ of the ordering we
have

$$P(T > t) \le 4m\,\lambda_2^t(W) \sum_{j=1}^{n} \left(\frac{d^j}{r^j}\right)^2,$$

where $W = \sum_{\{i,i'\} \in E} P_{ii'} \left(I - \frac{1}{2}(e_i - e_{i'})(e_i - e_{i'})^T\right)$,
$e_i = [0 \cdots 1\ 0 \cdots 0]^T$ is an $m \times 1$ vector with $i$th element
equal to one, and $\lambda_2(W)$ is the second largest eigenvalue of
$W$.

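3 Throughout this section we use superscripts to denote the rankings of
objects.

Proof: Let $b^j(t)$ be the vector obtained by the ratings of
the $m$ agents at time $t$ for object $j$ and let $y^j(t) = b^j(t) - \bar{b}^j \mathbf{1}$,
where $\mathbf{1}$ is the all-ones vector.
Note that if $\|y^j(t)\|^2 < (r^j/2)^2$, then this means that $|b_i^j(t) -
\bar{b}^j| < r^j/2$ for all $i$. Thus, if for all $j \in [n]$ we have $\|y^j(t)\|^2 <
(r^j/2)^2$, then it follows that

$$b_i^j(t) < \bar{b}^j + \frac{r^j}{2} \le \frac{1}{2}\left(\bar{b}^j + \bar{b}^{j+1}\right),$$

where the last inequality follows from the fact that $r^j \le
\bar{b}^{j+1} - \bar{b}^j$. Similarly, we have

$$b_i^{j+1}(t) > \bar{b}^{j+1} - \frac{r^{j+1}}{2} \ge \frac{1}{2}\left(\bar{b}^{j+1} + \bar{b}^j\right),$$

which follows from $r^{j+1} \le \bar{b}^{j+1} - \bar{b}^j$. Hence, we have

$$b_i^1(t) < \tfrac{1}{2}(\bar{b}^1 + \bar{b}^2) < b_i^2(t) < \tfrac{1}{2}(\bar{b}^2 + \bar{b}^3) < \cdots < \tfrac{1}{2}(\bar{b}^{n-1} + \bar{b}^n) < b_i^n(t),$$

and so $t$ is a consensus time for the algorithm. Thus,

$$\{T > t\} \subseteq \bigcup_{j=1}^{n} \left\{ \|y^j(t)\|^2 \ge \left(\frac{r^j}{2}\right)^2 \right\},$$

and hence, using the union bound, we obtain

$$P(T > t) \le \sum_{j=1}^{n} P\left( \|y^j(t)\|^2 \ge \left(\frac{r^j}{2}\right)^2 \right). \qquad (9)$$

Markov's inequality implies that

$$P\left( \|y^j(t)\|^2 \ge \left(\frac{r^j}{2}\right)^2 \right) \le \frac{4\,E\left[\|y^j(t)\|^2\right]}{(r^j)^2}.$$

Using the analysis in [15], it can be shown that

$$E\left[\|y^j(t)\|^2\right] \le \lambda_2^t(W)\,\|y^j(0)\|^2 \le m\,\lambda_2^t(W)\,(d^j)^2.$$

Combining the above two relations, we find

$$P\left( \|y^j(t)\|^2 \ge \left(\frac{r^j}{2}\right)^2 \right) \le 4m\,\lambda_2^t(W) \left(\frac{d^j}{r^j}\right)^2.$$

Substituting the last inequality into (9) proves the assertion. ■

Note that from [15], if $G$ is connected, then we have $\lambda_2(W) <
1$ and thus the probability $P(T > t)$ decays exponentially.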
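To make the bound of Theorem 26 concrete, the sketch below builds the matrix $W$ from an edge list, estimates $\lambda_2(W)$, and evaluates the right-hand side of the bound. It is an illustration under stated assumptions: agents are indexed from 0, the helper names are hypothetical, and $\lambda_2$ is estimated by power iteration restricted to vectors orthogonal to the all-ones vector. That estimation method is a choice made here, not one described in the paper, and a numerical library's eigenvalue routine would be preferable in practice.

```python
from math import sqrt

def gossip_matrix(m, edges, probs):
    """W = sum over edges of P_{ii'} (I - 0.5 (e_i - e_i')(e_i - e_i')^T)."""
    W = [[0.0] * m for _ in range(m)]
    for (i, k), p in zip(edges, probs):
        for v in range(m):
            W[v][v] += p          # the p * I term
        W[i][i] -= p / 2
        W[k][k] -= p / 2
        W[i][k] += p / 2
        W[k][i] += p / 2
    return W

def second_largest_eigenvalue(W, iters=500):
    """Power iteration on the complement of the all-ones eigenvector."""
    m = len(W)
    x = [float(v) for v in range(m)]
    lam = 0.0
    for _ in range(iters):
        mean = sum(x) / m
        x = [v - mean for v in x]             # remove the eigenvalue-1 direction
        norm = sqrt(sum(v * v for v in x))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in x]
        y = [sum(W[r][c] * x[c] for c in range(m)) for r in range(m)]
        lam = sum(a * b for a, b in zip(x, y))  # Rayleigh quotient
        x = y
    return lam

def consensus_tail_bound(t, m, lam2, spreads, gaps):
    """Right-hand side 4 m lam2^t sum_j (d^j / r^j)^2 of Theorem 26."""
    return 4 * m * lam2 ** t * sum((d / r) ** 2 for d, r in zip(spreads, gaps))

edges = [(0, 1), (1, 2)]
lam2 = second_largest_eigenvalue(gossip_matrix(3, edges, [0.5, 0.5]))
print(lam2, consensus_tail_bound(50, 3, lam2, [1, 2, 1], [2 / 3, 2 / 3, 2 / 3]))
```

Because the bound decays like $\lambda_2^t$, the value of $\lambda_2(W)$ determines how many gossip steps are needed before the right-hand side drops below a target probability.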